generalize first-write persistence and relax byte-zero read skipping #367

Merged
merged 4 commits into from
Mar 13, 2024

jgraettinger
Contributor

@jgraettinger jgraettinger commented Mar 12, 2024

Gazette has long had the behavior of immediately persisting a fragment having Begin == 0. This has the effect of "dirtying" the fragment index, which ensures that other brokers understand the journal is not in a pristine state in the event of a loss of Etcd consistency.

Generalize this handling to immediately persist a Fragment if it's the first known write of the index, even if at an offset greater than zero. This better handles the case where a journal has been idle for a while, all of its fragments have been expired by bucket policy, and new writes for the journal suddenly arrive.
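
As a rough illustration of the difference between the old and generalized rules (a sketch only, not Gazette's actual code): the `fragment` type and both function names below are hypothetical, and it assumes the broker decides based on whether the fragment index currently holds any fragments.

```go
package main

import "fmt"

// fragment is a simplified, hypothetical stand-in for a journal fragment
// covering the byte range [Begin, End).
type fragment struct {
	Begin, End int64
}

// persistedEarlyBefore models the prior rule: only a fragment that begins
// at byte zero is persisted immediately.
func persistedEarlyBefore(f fragment) bool {
	return f.Begin == 0
}

// persistedEarlyAfter models the generalized rule: a fragment is persisted
// immediately when it is the first known write of the index, at any offset.
// Assumption: "first known write" means the index holds no fragments yet,
// which also covers the Begin == 0 case of a brand-new journal.
func persistedEarlyAfter(indexed []fragment) bool {
	return len(indexed) == 0
}

func main() {
	// An idle journal whose fragments were all expired by bucket policy:
	// the index is empty, but the next write lands at a non-zero offset.
	var index []fragment
	next := fragment{Begin: 4096, End: 4200}

	fmt.Println("old rule persists immediately:", persistedEarlyBefore(next))
	fmt.Println("new rule persists immediately:", persistedEarlyAfter(index))
}
```

Under these assumptions the old rule leaves the index clean until the next flush interval, while the generalized rule dirties it right away.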

Also slightly relax read offset skips: a read from byte zero (and only byte zero) now skips forward immediately to the first available persisted fragment. This lets such reads proceed much more quickly, rather than waiting out the six-hour delay that guards against the possibility of reads racing writes to the fragment index during topology changes.
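
The relaxed skip can be sketched roughly as follows. This is an illustration under stated assumptions rather than the real index query: `persistedFragment` and `skipTarget` are hypothetical names, the six-hour delay is assumed to be measured against the time the fragment was persisted, and the caller is assumed to have already determined that no fragment covers the requested offset.

```go
package main

import (
	"fmt"
	"time"
)

// offsetJumpAgeThreshold mirrors the six-hour delay described above.
const offsetJumpAgeThreshold = 6 * time.Hour

// persistedFragment is a hypothetical, simplified index entry.
type persistedFragment struct {
	Begin, End int64
	ModTime    time.Time // when the fragment was persisted (assumed)
}

// skipTarget decides whether a read at offset, which no fragment covers,
// may skip forward to the next persisted fragment. It returns the offset to
// read from and whether a skip happened. fragments must be sorted by Begin.
func skipTarget(offset int64, fragments []persistedFragment, now time.Time) (int64, bool) {
	for _, f := range fragments {
		if f.Begin <= offset {
			continue // Not ahead of the requested offset.
		}
		// Reads from byte zero skip immediately. All other reads skip
		// only once the fragment is older than the threshold.
		if offset == 0 || now.Sub(f.ModTime) > offsetJumpAgeThreshold {
			return f.Begin, true
		}
		return offset, false
	}
	return offset, false // Nothing to skip to; keep waiting.
}

func main() {
	now := time.Date(2024, 3, 13, 12, 0, 0, 0, time.UTC)
	fragments := []persistedFragment{
		{Begin: 4096, End: 8192, ModTime: now.Add(-time.Minute)},
	}

	fmt.Println(skipTarget(0, fragments, now))   // Read from byte zero.
	fmt.Println(skipTarget(100, fragments, now)) // Read from a non-zero offset.
}
```

In this sketch the read from byte zero jumps straight to offset 4096, while the read from offset 100 keeps waiting until the fragment has aged past the threshold.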

Testing:

  • New and updated unit tests
  • Manual testing using the soak crash test framework in kustomize/, which uses a local Minio as a stand-in for S3.
    • Confirmed that deleting all persisted fragments causes the open spool to immediately persist.
    • They continue to flush at flush-interval boundaries.
    • In all other cases, the current fragment is not persisted.
  • Soak crash tests are passing so far; I'll let them run overnight.

… write

It's relatively common that a bucket policy will remove ALL fragments of
an idle journal. Later, the journal will suddenly have new writes, and
in this case it's strongly desired for a new dirtying fragment write to
occur as quickly as possible.

Seek forward a journal read at offset zero, to a first persisted
fragment of the journal.

This improves the ergonomics of a fairly common case where all data is
deleted from an idle journal (for example, because of bucket lifecycle),
and then new writes arrive, and a new reader is trying to read the
journal from "the beginning".

For this special case, we'd like reads to proceed forward as soon as a
persisted fragment is available in the index without waiting for the
full offsetJumpAgeThreshold.

@jgraettinger jgraettinger changed the title from "Johnny/offset skips" to "generalize first-write persistence and relax byte-zero read skipping" Mar 13, 2024
@jgraettinger jgraettinger marked this pull request as ready for review March 13, 2024 05:02
@jgraettinger jgraettinger requested a review from psFried March 13, 2024 05:02
Contributor

@psFried psFried left a comment

LGTM

@jgraettinger jgraettinger merged commit aeb5a47 into master Mar 13, 2024
1 check passed
@jgraettinger jgraettinger deleted the johnny/offset-skips branch March 13, 2024 14:59
2 participants